A Discriminative Global Training Algorithm for Statistical MT
Abstract
This paper presents a novel training algorithm for a linearly-scored block sequence translation model. The key component is a new procedure to directly optimize the global scoring function used by an SMT decoder. Unlike earlier work on SMT, no translation, language, or distortion model probabilities are used. Our method therefore employs less domain-specific knowledge, and is both simpler and more extensible than previous approaches. Moreover, the training procedure treats the decoder as a black box, and thus can be used to optimize any decoding scheme. The training algorithm is evaluated on a standard Arabic-English translation task.
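To make the idea of a linearly-scored model trained around a black-box decoder concrete, here is a minimal sketch. It is not the paper's procedure: the abstract does not specify the update rule, so a plain perceptron-style update is swapped in as an illustration. All names (`score`, `sum_features`, `perceptron_update`, `train`, `toy_decode`, `CANDIDATES`) and the toy features are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

FeatureVector = Dict[str, float]
Blocks = List[FeatureVector]  # one feature vector per block (assumed representation)


def score(weights: FeatureVector, features: FeatureVector) -> float:
    """Linear score: dot product of the weight vector and a feature vector."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def sum_features(blocks: Sequence[FeatureVector]) -> FeatureVector:
    """Global features of a block sequence: the sum of per-block features."""
    total: FeatureVector = {}
    for block in blocks:
        for name, value in block.items():
            total[name] = total.get(name, 0.0) + value
    return total


def perceptron_update(weights: FeatureVector, reference: FeatureVector,
                      hypothesis: FeatureVector, rate: float = 1.0) -> FeatureVector:
    """Assumed update rule: move weights toward the reference features and
    away from the features of the decoder's current output."""
    updated = dict(weights)
    for name in set(reference) | set(hypothesis):
        delta = reference.get(name, 0.0) - hypothesis.get(name, 0.0)
        if delta != 0.0:
            updated[name] = updated.get(name, 0.0) + rate * delta
    return updated


def train(decode: Callable[[str, FeatureVector], Blocks],
          data: Sequence[Tuple[str, Blocks]], epochs: int = 5) -> FeatureVector:
    """The decoder is only ever called, never inspected (black box)."""
    weights: FeatureVector = {}
    for _ in range(epochs):
        for source, reference_blocks in data:
            hypothesis_blocks = decode(source, weights)
            weights = perceptron_update(weights,
                                        sum_features(reference_blocks),
                                        sum_features(hypothesis_blocks))
    return weights


# Toy black-box decoder: picks the highest-scoring candidate block sequence.
CANDIDATES: Dict[str, List[Blocks]] = {"src": [[{"a": 1.0}], [{"b": 1.0}]]}


def toy_decode(source: str, weights: FeatureVector) -> Blocks:
    return max(CANDIDATES[source], key=lambda blocks: score(weights, sum_features(blocks)))


trained = train(toy_decode, [("src", [{"b": 1.0}])])
```

Because `train` only needs a callable that maps a source and a weight vector to an output, any decoding scheme could be plugged in under these assumptions; that is the sense in which the decoder is a black box here.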
Similar resources
A Discriminative Syntactic Word Order Model for Machine Translation
We present a global discriminative statistical word order model for machine translation. Our model combines syntactic movement and surface movement information, and is discriminatively trained to choose among possible word orders. We show that combining discriminative training with features to detect these two different kinds of movement phenomena leads to substantial improvements in word order...
Scalable Purely-Discriminative Training for Word and Tree Transducers
Discriminative training methods have recently led to significant advances in the state of the art of machine translation (MT). Another promising trend is the incorporation of syntactic information into MT systems. Combining these trends is difficult for reasons of system complexity and computational complexity. The present study makes progress towards a syntax-aware MT system whose every compon...
The QMUL system description for IWSLT 2010
The QMUL submission to IWSLT 2010 is a phrase-based statistical MT system. It uses a multi-stack, multi-beam decoder with several features, whose weights are tuned on the provided development data with the Minimum Error Rate Training (MERT) algorithm. This year QMUL participated in the Arabic-English, French-English and Turkish-English language pairs of the BTEC task. A discriminative reordering model is added as...
Simulating Discriminative Training for Linear Mixture Adaptation in Statistical Machine Translation
Linear mixture models are a simple and effective technique for performing domain adaptation of translation models in statistical MT. In this paper, we identify and correct two weaknesses of this method. First, we show that standard maximum-likelihood weights are biased toward large corpora, and that a straightforward preprocessing step that down-samples phrase tables can be used to counter this ...
Hierarchical MT Training using Max-Violation Perceptron
Large-scale discriminative training has become promising for statistical machine translation by leveraging the huge training corpus; for example the recent effort in phrase-based MT (Yu et al., 2013) significantly outperforms mainstream methods that only train on small tuning sets. However, phrase-based MT suffers from limited reorderings, and thus its training can only utilize a small portion ...